Closed
Conversation
…nTicket records SSL_write can return SSL_ERROR_WANT_READ in TLS 1.3 when pending NewSessionTicket records need to be drained before application data can be sent. The old code treated any rc<=0 from SSL_write as fatal, causing spamc to silently fail to send the SPAMC protocol header and hang until spamd's 30-second timeout fired. Also fixes ssl_timeout_read's retry loop, which checked errno==EWOULDBLOCK instead of SSL_get_error()==SSL_ERROR_WANT_READ — the wrong error mechanism for OpenSSL. Changes: - spamc/libspamc.h: add SPAMC_DEBUG_SSL (1<<11) flag - spamc/spamc.c: add -D/--ssl-debug flag to trace SSL state to stderr - spamc/utils.c: fix ssl_timeout_read retry to use SSL_get_error() - spamc/libspamc.c: add _ssl_write_with_retry() helper that loops on WANT_READ/WANT_WRITE; add debug tracing after SSL_connect, around SSL_write, and on ssl_timeout_read failures Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
SSL_write(ssl, buf, 0) returns 0 with SSL_ERROR_SYSCALL, which the error-checking code treated as fatal. The non-SSL full_write path handles zero-length writes as a harmless no-op; match that behavior by skipping the SSL body write entirely when towrite_len == 0. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
|
This does sound like a serious bug. I recommend that you also open a bug in the SpamAssassin bugzilla and reference your final PR here containing the fixes. |
Author
|
Hey John, I refactored this to be on a different branch in my repo fork, and this was reopened as pull request #26 . I've created the bugzilla request as you suggested. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Spamc has several bugs that cause TLS1.3 to not work, which is the default version that will be negotiated (spamd listens on TLS1.3 because it just does whatever IO:Socket:SSL supports).
I (not the LLM) diagnosed this by finding random disconnects when trying to do spamc -K -S connecting to an SSL-enabled spamd. Spamd would only log:
Apr 2 21:03:17 post spamd[89898]: prefork: child states: II
Apr 2 21:03:18 post spamc[89911]: SSL write failed (5)
After adding a bunch of debug prints to spamd, and forcing spamd to not do TLS1.3, I attempting connect with openssl s_client, which worked, but spamc did not.
The three bugs present in the current code are:
Bug 1: ssl_timeout_read retry loop checks wrong error mechanism spamc/utils.c — retry loop checked errno == EWOULDBLOCK instead of SSL_get_error() == SSL_ERROR_WANT_READ. OpenSSL uses its own error queue, not errno, so the retry never fired.
Bug 2: SSL_write not retried on SSL_ERROR_WANT_READ spamc/libspamc.c — In TLS 1.3, the server sends post-handshake NewSessionTicket records after the handshake completes. SSL_write can return SSL_ERROR_WANT_READ while these are pending. The original code treated any rc <= 0 from SSL_write as a fatal error with no retry.
Bug 3: SSL_write(ssl, buf, 0) treated as fatal error spamc/libspamc.c — For commands with no body (e.g. PING / -K), towrite_len == 0. Calling SSL_write with length 0 returns 0, which the rc <= 0 check treated as failure. The non-SSL full_write path handles zero-length writes as a no-op.
This code also adds a -D argument to spamc so that future SSL connect issues may be debugged (not recommended for normal use), because doing so with truss/strace is painful.
Tested via both:
spamc/spamc -S -D -l -d localhost -p 784 < t/data/spam/001 (actual message test)
spamc/spamc -S -D -l -K -d localhost -p 784 (send a test ping)